System Identification Methodology of a Gas Turbine Based on Artificial Recurrent Neural Networks

The application of identification techniques using artificial intelligence to the gas turbine (GT), whose nonlinear dynamic behavior is difficult to describe through differential equations and the laws of physics, has begun to gain importance for a little more than a decade. NARX (Nonlinear autoregressive network with exogenous inputs) is one of the models used to identify GT because it provides good results. However, existing studies need to show a systematic method to generate robust NARX models that can identify a GT with satisfactory accuracy. In this sense, a systematic method is proposed to design NARX models for identifying a GT, which consists of nine precise steps that go from identifying GT variables to obtaining the optimized NARX model. To validate the method, it was applied to a case study of a 215 MW SIEMENS TG, model SGT6-5000F, using a set of 2305 real-time series data records, obtaining a NARX model with an MSE of 1.945 × 10−5, RMSE of 0.4411% and a MAPE of 0.0643.


Introduction
For the most part, power production plants use fossil fuels or nuclear energy. However, with the depletion of fossil fuels and environmental pollution problems due to greenhouse gas emissions, the proportion of power generated from renewable energy has gradually increased to replace other power generation systems [1]. The production of energy from those that are renewable varies according to natural conditions, such as wind speed and solar radiation. For this reason, it has been studied how to overcome the problem of the intermittency of renewable energies and their storage [2]. Faced with this problem, attention is being paid to the Gas Turbine (GT), which has the fastest response among conventional power generators, which is convenient so as not to affect the stability of electrical networks [3].
A GT has a high specific power and emits much fewer pollutants because it uses natural gas as fuel. In addition, it can achieve fast starts and stops compared to other power generation systems, such as coal and nuclear power generation, and fast load-following operation. Recently developed GTs use the combustion of hydrogen and natural gas mixtures, making greener operations possible [4].
For reasonable control and monitoring of the operation of the GT, a model is required that represents with excellent approximation its real dynamic behavior. There are two approaches to modeling a GT: the white box and the black box.
White box models. The white box models of the GT describe its behavior using physical equations based on engineering principles and its dynamics. These models are used when there is sufficient knowledge about the physics of the system [5], and they can be nonlinear [6,7] or linearized [8,9].
Several investigations have been carried out in the field of identification of the behavior of a GT using white-box models, among them, for adaptive control [10], for a low-power

Operation of the GT
The GT is an internal combustion engine that uses the air's gaseous energy to convert the fuel's chemical energy into mechanical energy and works according to the Brayton cycle [23]. Figure 1 shows a typical schematic of a GT system made up of an air compressor, a combustor, and a turbine. Air at atmospheric pressure enters the compressor (1), which is increased by the compressor at its outlet (2), and enters the combustion chamber to mix with the fuel and is ignited to produce hot expanding gas that enters (3) and drives the turbine to generate mechanical energy on its axis that rotates at a certain angular speed, in order to drive electric generators, pumps, compressors, among others. Finally, the gases leave the turbine (4).
The ideal Brayton cycle consists of two isobaric and two isentropic processes [24]. The two isobaric processes consist of the combustion system of the gas turbine and the gas side of the HRSG (Heat Recovery Steam Generating). The two isentropic processes represent compression and expansion processes. The ideal Brayton cycle consists of two isobaric and two isentropic processes [24]. The two isobaric processes consist of the combustion system of the gas turbine and the gas side of the HRSG (Heat Recovery Steam Generating). The two isentropic processes represent compression and expansion processes.  The expressions for compressor work ( ), Turbine work ( ), total output work ( ), heat added to the system ( , ) and efficiency total cycle ( , ), are formulated with Equations (1) to (5), respectively [7].
where is the air mass, is the mass of fuel, is the lower heating of fuel and ℎ , ℎ , ℎ , ℎ are the enthalpies of states 1 through 4, respectively.  The ideal Brayton cycle consists of two isobaric and two isentropic processes [24]. The two isobaric processes consist of the combustion system of the gas turbine and the gas side of the HRSG (Heat Recovery Steam Generating). The two isentropic processes represent compression and expansion processes.   The expressions for compressor work ( ), Turbine work ( ), total output work ( ), heat added to the system ( , ) and efficiency total cycle ( , ), are formulated with Equations (1) to (5), respectively [7].
where is the air mass, is the mass of fuel, is the lower heating of fuel and ℎ , ℎ , ℎ , ℎ are the enthalpies of states 1 through 4, respectively. The expressions for compressor work (W c ), Turbine work (W t ), total output work (W cyc ), heat added to the system (Q 2,3 ) and efficiency total cycle (η 2,3 ), are formulated with Equations (1) to (5), respectively [7].
where . m a is the air mass, . m f is the mass of fuel, LHV f uel is the lower heating of fuel and h 1 , h 2 , h 3 , h 4 are the enthalpies of states 1 through 4, respectively.
The higher the turbine ignition pressure-temperature ratio, the greater the efficiency of the Brayton cycle. The total cycle efficiency relationship is based on the simplifying assumptions: (1) . m a . m f , (2) the gas is calorically and thermally perfect, that is, the specific heat at constant pressure (c p ) and the specific heat at constant volume (c v ) are constant, and the ratio of specific heat γ remains constant throughout the cycle, (3) the pressure ratio (r p ) of the compressor and the turbine are the same, and (4) components work at 100% efficiency. Under these assumptions, the efficiency of the ideal Brayton cycle, operating between room temperature and ignition temperature, is given by Equation (6): where r P = Pressure ratio; and γ is the ratio of the specific heats. Supposing that r P is the same in the compressor and the turbine, the following relationships hold.
with r P in the turbine.
In a real cycle, taking into account the efficiencies η c of the turbine compressor and the efficiency η t of the expander, the efficiency of the entire cycle between the ignition temperature T f and room temperature T amb of the turbine is given by Equation (9).
The maximum t"I have checked and revised all"otal efficiency of the thermal cycle, considering the optimal pressure ratio for fixed inlet temperatures and efficiencies for the compressor and the turbine, is obtained with the following relationship.
So that there are no losses in the compressor and the turbine, in Equation (10), it must be considered η c = η t = 1, with which the optimal pressure ratio is reduced to: The optimum pressure ratio for the maximum output work of a turbine, taking into account the efficiencies of the compressor and the turbine expander, is given by Equation (12).

The NARX Model
The NARX model is a type of ANN with feedback suitable for non-linear modeling systems, especially time series, that use past measurements to predict future values [25].
In Figure 3, the NARX model is illustrated, where the "System" represents a real or artificial process to be approximated, and the "Model" represents the system that allows its behavior to be simulated. NARX can be used in series-parallel mode or parallel mode. The series-parallel mode predicts one or more future steps based on past exogenous inputs (u) and outputs (y) of the system. On the contrary, in parallel mode, the model's output is considered instead of the output of the system [17]. Considering that the model output may have an error, these could negatively influence the model outputs in relation to the system, so in this work, we will consider the series-parallel mode.
The future value of the output signalŷ(t) depends on its m past values Y(t, m) and on R × n past exogenous inputs U(R, n). The function f approximates the behavior of the TG. NARX can be implemented using a feedforward neural network to approximate the function f [26].

Systematic Method to Build the NARX Model
A method for the systematic design of a NARX model with high precision for a GT is proposed, based on nine steps (Figure 4), which go from identifying variables to obtaining a GT's identification model. Each of the steps is described below:

Systematic Method to Build the NARX Model
A method for the systematic design of a NARX model with high precision for a GT is proposed, based on nine steps (Figure 4), which go from identifying variables to obtaining a GT's identification model. Each of the steps is described below:

GT Variables (P1)
The first step in any design of a model for a GT is the definition of its input and output variables. The output variables refer to the behavior of the GT to be identified, such as the angular velocity [27], the outlet temperature [20], the output power, and the thermal efficiency [28]. The input variables, also called exogenous, are the variables that influence the behavior of the GT, such as the compressor inlet temperature [10], the compressor inlet pressure, the fuel flow, and the Inlet Guide. Vane (IGV) [22]. All these variables allow for knowing the dynamic behavior of a GT. Depending on the TG to be identified, the input and output variables of the GT must be identified.

GT Variables (P1)
The first step in any design of a model for a GT is the definition of its input and output variables. The output variables refer to the behavior of the GT to be identified, such as the angular velocity [27], the outlet temperature [20], the output power, and the thermal efficiency [28]. The input variables, also called exogenous, are the variables that influence

Dataset (P2)
In this step, a dataset is determined on the input and output variables identified in step P1. The dataset can be obtained by simulation [29] or measuring the GT under study [30]. In either case, it is required that data cover one operating cycle of the GT, that the period between data recordings is constant and as small as a few minutes or seconds, and that data are reliable. The dataset, in turn, will be divided into three datasets, one for training, another for validation, and one for testing, with some authors considering proportions of 70-15-15% [16] or 60-15-25% [31], where a-b-c% are the proportions of the dataset for training (a%), validation (b%) and testing (c%).

Preprocessing (P3)
This process aims to place the dataset ready for processing, that is, to be used in the training, validation, and testing process. For this, the following activities [32,33] are carried out if the case warrants: integration of data in various repositories, cleaning, imputation, and normalization.

Variable Selection (P4)
The purpose of this step is to select the input variables that are correlated with the output variable. In this way, we can understand the behavior of the GT. For this purpose, there are various techniques, such as the Principal Component Analysis evaluator [34,35] and the Pearson correlation [36], some of them implemented in libraries such as Python, Pytorch, Keras, and in tools such as MatLab and Waikato Environment for Knowledge Analysis (Weka).

Error Metrics (P5)
The error metrics that are commonly used to evaluate and report the performance of a regression model are, among others: the mean absolute error (MAE) [28], the mean square error (MSE) [37,38], the error mean absolute percentage (MAPE) [39], and root mean square error (RMSE) [40]. Letŷ(t) be the output of the model, y(t) the output of the GT, and N the number of records to evaluate, then these metrics are determined as:

Design (P6)
The design of a NARX implies a network structure that includes all the elements for it to learn to identify a GT [41,42]. For what should be considered: input variables, output variables, an input layer, and an output layer, one or more hidden layers, number of neurons in each hidden layer, transfer functions, such as logsig, purelin, hardlim, satlin, and poslin, activation function, propagation function, training function, such as trainlm and trainbfg, and number of delays of the output signal and input signals. In addition, values for the hyperparameters, the number of epochs, the learning rate, the momentum rate, and the desired final error must be considered.
Some tools allow configuring a NARX topology quickly, such as MATLAB and Python, in their current versions.

Training and Validation (P7)
In this step, the NARX model designed in step 6 is trained using the training dataset and learning algorithms, such as backpropagation [43,44] and Q-Learning [45]. The model was validated based on cross-validation techniques. These algorithms were implemented in MATLAB [46,47], Weka [44,48], and Python, as well as its frameworks, such as Tensor Flow and Keras [49].

Fine Tuning (P8)
This step is applied only if the model obtained in step 7 presents unsatisfactory results (metrics).
Fine-tuning is the procedure that aims to determine which hyperparameter values (number of layers, number of nodes per layer, types of activation functions for each layer, initial weights, etc.) provide the model with better results in the learning process [50]. In general, fine-tuning is implemented in various tools that provide machine learning algorithms, some fine-tuning code being GridSearch or RandomizedSearch.

Testing (P9)
In this step, the successful NARX model obtained in step 7 is tested against the validation dataset, and the error metrics given in step 5 are measured. If the results of this step are satisfactory, that is, an acceptable error, then an adequate NARX model is obtained, and the process is complete; otherwise, it must return to step 6 to build a satisfactory new model. For testing, the same tools are used for training.
Satisfactory NARX model. The final model will be obtained from step 9 if the result of the metrics used is satisfactory. In this way, the model will allow the representation of a GT's behavior with the desired accuracy.
Next, a case study is developed to illustrate the application of the proposed method.

Case Study: Single Shaft Open Cycle Gas Turbine
The GT of Lima (GTL) is from 2009 and is a single-axle, heavy-duty, whose characteristics and a view of it are shown in Table 1 and Figure 5, respectively. It works five days a week, 24 h a day, Monday through Friday, and is used by a private electricity generation company in Peru. In the present case study, the GTL output variable of interest is the rotational speed of its axis. Next, the application of the proposed method step by step for the GTL is shown. The input and output variables of the GTL are shown in Table 2. Data were obtained through transmitters of all the variables, through a data logging system of the control and supervision system SPPA-T3000 control system of SIEMENS, during eight days, every 5 min, considering the cycle of intermittent starts and stops, obtaining 2305 records in Excel, which are available upon request. Figure 6 shows a sample of the dataset. It has been considered to divide the dataset for training, validation, and testing into 70%, 15%, and 15%, respectively.    Since data were correctly obtained by the SCADA data logging system, no integration, cleaning, or data imputation was required. However, to avoid biases of one variable, data were normalized, proportionally passing all data to a scale from 0 to 1. The variation of the values of the variables , (t), (t), (t), (t) and (t) before normalization is from 10 −9 to 10.879533 Kg/s, from 14.758274 to 21.932644 °C, from   Since data were correctly obtained by the SCADA data logging system, no integration, cleaning, or data imputation was required. However, to avoid biases of one variable, data were normalized, proportionally passing all data to a scale from 0 to 1. The variation of the values of the variables u 1 , u 2 (t), u 3 (t), u 4 (t), u 5 (t) and y 1 (t) before normalization is from 10 −9 to 10.879533 kg/s, from 14 For the selection of the input and output variables, the InfoGain AttributeEval program of the Weka version 3.4.8 tool was applied, through which the input variables u 4 (t) and u 5 (t) were discarded. Thus, the selected variables are u 1 (t), u 2 (t) and u 3 (t). In addition, to confirm the selected variables, the Pearson correlation coefficient was used, which shows a value greater than 0.95. Figure 7 shows part of the behavior of the normalized data for the three input variables.
For the selection of the input and output variables, the InfoGain Attribute program of the Weka version 3.4.8 tool was applied, through which the input vari ( ) and ( ) were discarded. Thus, the selected variables are ( ), ( ) and In addition, to confirm the selected variables, the Pearson correlation coefficient was u which shows a value greater than 0.95. Figure 7 shows part of the behavior o normalized data for the three input variables.   Figure 8 shows part of the spin-holds of the angular velocity, which constitute periods of rotation of the GTL, driven by a prime motor and not by gas, with the pur of its cooling. Therefore, when it keeps rotating, the fins of the GTL produce cooling. T spin-holds occur between the time intervals of 600-650 min and 1150-1200 min. 3.4272156 at 3634.8547 RPM, respectively.
For the selection of the input and output variables, the InfoGain AttributeEval program of the Weka version 3.4.8 tool was applied, through which the input variables ( ) and ( ) were discarded. Thus, the selected variables are ( ), ( ) and ( ).
In addition, to confirm the selected variables, the Pearson correlation coefficient was used, which shows a value greater than 0.95. Figure 7 shows part of the behavior of the normalized data for the three input variables.   Figure 8 shows part of the spin-holds of the angular velocity, which constitute short periods of rotation of the GTL, driven by a prime motor and not by gas, with the purpose of its cooling. Therefore, when it keeps rotating, the fins of the GTL produce cooling. These spin-holds occur between the time intervals of 600-650 min and 1150-1200 min.  Figure 8 shows part of the spin-holds of the angular velocity, which constitute short periods of rotation of the GTL, driven by a prime motor and not by gas, with the purpose of its cooling. Therefore, when it keeps rotating, the fins of the GTL produce cooling. These spin-holds occur between the time intervals of 600-650 min and 1150-1200 min.
The metrics used to evaluate the performance of the NARX models are MSE and MAPE, whose calculations are given in Equations (4) and (5), respectively, and whereŷ(t) is the model output and N is the number of records of the dataset to be evaluated.
For the design of the NARX model of the TGL, the MATLAB tool (v. 20-22a) was used with the three selected input variables and the output variable; likewise, one or more hidden layers were considered with a variation of neurons per layer, plus various backpropagation training functions, transfer functions, and one or more lags on the input and output variables. On the other hand, the number of epochs was set equal to 100, the learning rate at 0.01, the momentum rate at 0.9, and the desired error of no more than 1%.
Next, the NARX model established in the design is trained using the training dataset. An iterative process in two loops has been considered to determine the values of the hyperparameters that provide the best results for the model. The first is to tune the hyperparameters in the training and validation process (Step P7), and the second for the testing process (Step P9). In each loop, the number of neurons has been varied from 1 to 30 for each hidden layer. Tests have been carried out with different backpropagation training functions and a combination of other transfer functions for the hidden and output layers. In addition, 1 to 4 delays have been considered. The metrics used to evaluate the performance of the NARX models are MSE and MAPE, whose calculations are given in Equations (4) and (5), respectively, and where ( ) is the model output and is the number of records of the dataset to be evaluated. For the design of the NARX model of the TGL, the MATLAB tool (v. 20-22a) was used with the three selected input variables and the output variable; likewise, one or more hidden layers were considered with a variation of neurons per layer, plus various backpropagation training functions, transfer functions, and one or more lags on the input and output variables. On the other hand, the number of epochs was set equal to 100, the learning rate at 0.01, the momentum rate at 0.9, and the desired error of no more than 1%.
Next, the NARX model established in the design is trained using the training dataset. An iterative process in two loops has been considered to determine the values of the hyperparameters that provide the best results for the model. The first is to tune the hyperparameters in the training and validation process (Step P7), and the second for the testing process (Step P9). In each loop, the number of neurons has been varied from 1 to 30 for each hidden layer. Tests have been carried out with different backpropagation training functions and a combination of other transfer functions for the hidden and output layers. In addition, 1 to 4 delays have been considered.
The NARX model, obtained by the training and validation processes and fine-tuning, is tested with the Testing dataset to know the model's performance through the metrics (Step P9). Steps P6-P9 were repeated until a model was obtained that provided satisfactory results in the validation process.
After carrying out all the experiments, the NARX model was finally obtained ( Figure  9), which we will call optimal NARX, with the hyperparameters:  The NARX model, obtained by the training and validation processes and fine-tuning, is tested with the Testing dataset to know the model's performance through the metrics (Step P9). Steps P6-P9 were repeated until a model was obtained that provided satisfactory results in the validation process.
After carrying out all the experiments, the NARX model was finally obtained (Figure 9), which we will call optimal NARX, with the hyperparameters:   Figure 11 shows the regression graph, which indicates the relationship between the NARX network's output and the system's output (objective). The R-value is an indication of the relationship between outputs and goals. As the figure shows, the R values for all    Figure 11 shows the regression graph, which indicates the relationship between the NARX network's output and the system's output (objective). The R-value is an indication of the relationship between outputs and goals. As the figure shows, the R values for all plots are very close to 1. Therefore, the result for each training, validation, and testing data set indicates a good fit. Figure 12 shows the behavior of the real angular velocity (blue) and that obtained by the optimal NARX model (red) for the GTL.     Figure 13a shows the angular velocity variations for the Testing and the optimal NARX between 3595 and 3605 rpm. It should be noted that the expected angular speed is 3600 rpm. Figure 13a shows that between 2100 and 2250 min, a transient occurs in the turbine load, which was controlled automatically. Likewise, Figure 13b shows the relative error (RE).

Conclusions
This article proposes a systematic method for designing and developing black box models based on NARX neural networks to identify a robust, accurate, and reliable GT model. The proposal consists of 9 well-defined steps that go from identifying GT variables to obtaining a satisfactory NARX model.

Conclusions
This article proposes a systematic method for designing and developing black box models based on NARX neural networks to identify a robust, accurate, and reliable GT model. The proposal consists of 9 well-defined steps that go from identifying GT variables to obtaining a satisfactory NARX model.
In order to validate the proposed method, a case study was carried out using a dataset of 2305 records obtained by measuring the input and output variables of a SIEMENS brand gas turbine, model SGT6-5000F of 215 MW, located in Peru, called GTL. A computer program code was generated and executed in the MATLAB environment (v. 20-22a), following the proposed method step by step, systematically achieving a NARX model for the GTL. The training, validation, and testing results of the generated NARX model did not present significant deviations between the simulated and measured data. The values of the MSE, RMSE, and MAPE metrics during the training, validation, and testing phases were satisfactory, less than 10 −5 ; 0.5% and 0.06, respectively, values that are very competitive compared to other similar cases.
The case study shows that the proposed method for the development of a NARX model is adequate to identify a GT and to predict with high precision its output parameters based on the changes in the system inputs. Furthermore, since the method is indifferent to the type of gas turbine, it can be applied to predict the behavior of similar gas turbine systems with high accuracy.
This applied research work has as future activities the development of advanced control systems using artificial intelligence techniques. In addition, new neural network architectures can be considered, such as LSTM.

Data Availability Statement:
The private data presented in this study are available on request from the first author.

Conflicts of Interest:
The authors declare no conflict of interest.